(54) I/O control apparatus having check recovery function



(57) In a computer system, when a CPU (1a, 1b)
performs state setting of an operation mode or the like
to I/O devices (4a, 4b), the log data of the state setting
is stored in a set log storage area. Upon occurrence of
a fault in the computer system, the I/O devices are
cleared, and state setting of the I/O devices is performed
on the basis of the log data of the state setting
stored in the set log storage area (34). Therefore, the
states of the I/O devices can be recovered to a state at
a checkpoint when the process is restarted.
Description

The present invention relates to an I/O control
apparatus adapted to a computer system having a
checkpoint recovery function.

In recent years, computer systems have developed
considerably. With this development, reliability
requirements, such as coping with a fault, have become strict.
One method for constituting a fault tolerant computer
system is a checkpoint recovery scheme.

According to a method for implementing a checkpoint
recovery scheme, the internal state of a CPU,
namely, the contents of the registers and the cache
memory of the CPU, is periodically saved in a main
memory to acquire a checkpoint on the main memory.
When data processing cannot be continued due to a
fault in the computer system, the main memory is
restored to the state of the most recent checkpoint and
the data processing is restarted using the internal state
of the CPU stored in the main memory.

A method for restoring the main memory to the
state of the checkpoint is as follows. In an update
operation of a main memory, the address and the data to be
updated (the before image) are stored in a memory state
recovery unit 55. Upon occurrence of a fault in the
computer system, the before-image data stored in the memory
state recovery unit 55 is written back to the main memory.
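The before-image mechanism described above can be sketched in a few lines of Python. This is only a behavioral model under stated assumptions: the real unit is hardware, and the class name `MemoryStateRecoveryUnit`, its methods, and the use of a dictionary as "main memory" are all hypothetical.

```python
class MemoryStateRecoveryUnit:
    """Toy model of before-image logging (all names are illustrative assumptions)."""

    _ABSENT = object()  # marks an address that held no value before the update

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value, stands in for main memory
        self.undo_log = []     # (address, before_image) pairs since the last checkpoint

    def write(self, address, value):
        # Record the before image first, then update main memory.
        self.undo_log.append((address, self.memory.get(address, self._ABSENT)))
        self.memory[address] = value

    def checkpoint(self):
        # A new checkpoint makes the logged before images unnecessary.
        self.undo_log.clear()

    def rollback(self):
        # Write the before images back, newest first, to reach the checkpoint state.
        while self.undo_log:
            address, before = self.undo_log.pop()
            if before is self._ABSENT:
                self.memory.pop(address, None)
            else:
                self.memory[address] = before


memory = {0x10: 1, 0x14: 2}
unit = MemoryStateRecoveryUnit(memory)
unit.checkpoint()
unit.write(0x10, 99)
unit.write(0x10, 100)
unit.write(0x18, 7)
unit.rollback()   # memory is back to its contents at the checkpoint
```

Undoing the entries newest first matters when the same address is written more than once between checkpoints: the oldest before image is applied last and therefore wins.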

In this checkpoint recovery scheme, upon
occurrence of a fault in the computer system, the internal
state of the main memory or the CPU can be restored to
the state of the most recent checkpoint by using the
memory state recovery unit 55. However, an I/O device
connected to the computer system cannot be easily restored
to the state of the most recent checkpoint.

This problem will be described below with reference
to FIGS. 1 and 2.

As shown in FIG. 1, in this computer system, a CPU
51 requests a disk controller 52 to access a disk 53 to
perform an I/O operation. FIG. 2 shows a timing diagram
of the I/O processing of the computer system having
the above arrangement.

As depicted in FIG. 2, registers of the disk controller
52 are set to read data from a predetermined position of
the disk 53 at times T0 to T1 ((1) in FIG. 2), and the disk
controller 52 is started at time T1 ((2) in FIG. 2). In this
manner, the disk controller 52 and the disk 53 execute a
read operation at times T1 to T2 ((3) in FIG. 2). The read
data are transferred into the main memory 54 by DMA
transfer from the disk controller 52.

The CPU 51 receives a completion interrupt from
the disk controller 52 at time T2 ((4) in FIG. 2), thereby
performing completion interrupt processing for the disk
controller 52 at times T2 to T3 ((5) and (6) in FIG. 2).
Other post processing with respect to the read operation
is performed at times T3 to T4 ((7) in FIG. 2).

The first difficulty in this case is that a checkpoint 
acquired at an arbitrary timing is not always valid. 

For example, assume that a checkpoint is acquired
in the middle of setting the registers of the disk controller
52 (the setup sequence between times T0 and T1).

In this case, upon occurrence of a fault of the computer
thereafter, a latter part of the setup sequence is
re-performed from the most recent checkpoint, namely,
only a part of the registers of the disk controller 52 are
set again. For this reason, the disk controller 52 does
not always operate desirably.

In consideration of the characteristics of the disk
controller 52, not only at times T0 to T1 described
above, but also at times T0 to T3, i.e., when the CPU 51
acquires a checkpoint during a setup sequence for an
I/O operation such as a read/write operation, the disk
controller 52 does not always operate desirably when a
latter part of the setup sequence is re-performed from
the checkpoint after a fault occurs in the system.

One method to solve the difficulty is to prohibit
checkpointing during a setup sequence of an I/O device.
However, in a computer system in which many I/O devices
are incorporated, the CPU is almost always performing a
setup sequence of some I/O operation. Therefore,
preventing checkpointing during a setup sequence of an
I/O device may lead to considerable performance
degradation.

The second difficulty is as follows. Assume a fault
occurs in the system during a DMA transfer from the
disk controller 52 to the main memory 54. In this case,
the ongoing DMA transfer must be stopped before the main
memory 54 is restored to the state of the most recent
checkpoint.

In a conventional computer system, in order to stop
an ongoing DMA transfer, it is necessary to initialize (reset)
the I/O device. Since the I/O device is set in an initial
state by initializing the I/O device, a special process is
required to restore the I/O device to the state of the
most recent checkpoint.

As a scheme for solving the problem of I/O
processing in the above checkpoint recovery scheme,
the following two schemes are known.

The first scheme is disclosed in USP-4740969
"METHOD AND APPARATUS FOR RECOVERING
FROM HARDWARE FAULTS". In normal data
processing, the data of reads/writes of the registers of an
I/O device and the interrupts from the I/O device are
recorded in a log memory. When a register setup
sequence is restarted from the most recent checkpoint
after a fault occurs in the computer system, the
read/write operations performed to the registers of the
I/O device before the fault occurred are re-performed as
follows. For a write operation, the data is discarded and
not written to the registers of the I/O device. For a read
operation, instead of reading out from the register of the
I/O device, the data in the log memory is returned to the
CPU. For an interrupt from the I/O device, the interrupt
is generated and sent to the CPU at the same timing as
in the preceding execution.
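The replay behavior of this first prior-art scheme, as summarized above, can be illustrated roughly as follows. This is a loose Python sketch of the read/write handling only, not the circuit of the cited patent; interrupt replay is omitted, and the class `LoggingInterface` and its methods are hypothetical names.

```python
class LoggingInterface:
    """Sits between the CPU and device registers (illustrative assumption)."""

    def __init__(self, device_registers):
        self.regs = device_registers   # dict: register name -> value
        self.log = []                  # recorded (kind, register, data) tuples
        self.replay_pos = None         # None = normal mode, int = replaying

    def start_replay(self):
        # Called when execution restarts from the most recent checkpoint.
        self.replay_pos = 0

    def _replaying(self):
        return self.replay_pos is not None and self.replay_pos < len(self.log)

    def write(self, reg, data):
        if self._replaying():
            self.replay_pos += 1       # discard: the device already saw this write
            return
        self.log.append(("write", reg, data))
        self.regs[reg] = data

    def read(self, reg):
        if self._replaying():
            _, _, data = self.log[self.replay_pos]
            self.replay_pos += 1
            return data                # logged data is returned, device is not read
        data = self.regs[reg]
        self.log.append(("read", reg, data))
        return data


regs = {"status": 0, "cmd": 0}
iface = LoggingInterface(regs)
iface.write("cmd", 5)
first = iface.read("status")
regs["status"] = 1           # the device changes state after the original read
iface.start_replay()
iface.write("cmd", 5)        # discarded during replay
replayed = iface.read("status")
```

During replay the CPU sees the same value it saw the first time, even though the device register has since changed, which is the point of logging the reads.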

This scheme requires a special interface circuit
which is not provided to an ordinary computer system.
Moreover, it is difficult to apply this scheme to a
multiprocessor system.
The second scheme is disclosed in Sequoia: A
Fault-tolerant Tightly Coupled Multiprocessor for
Transaction Processing, IEEE Computer, February 1988. In
this scheme, data processing in a computer system is
divided into a data processing portion which can be
performed by using only CPUs and a main memory and an
I/O processing portion which handles I/O devices.
These portions are executed by different computers.

FIG. 3 shows the schematic arrangement of a computer
system in which data processing in the computer
system is divided into a portion performed by only
access to a main memory and a portion including
access to the I/O device, and the former is executed by
a computer 100 whose reliability is improved by the
checkpoint recovery scheme, and the latter is executed
by a computer 200 which does not use the checkpoint
recovery scheme. In the logical interface between these
portions, a request representing "read the designated
amount of data at the designated position of the
designated disk" is sent from the computer 100 to the
computer 200. When the computer 200 actually has read data,
a termination code indicating whether the operation is
normally completed or not and the data read from the
disk are returned from the computer 200 to the
computer 100.

To improve the reliability of the computer 200, the
constituent elements of the computer 200 are duplicated.
Namely, the computer 200 consists of computer
main bodies 210a and 210b and I/O devices 220a and
220b. In a normal state, the request is simultaneously
processed on both sides, and the execution results are
compared with each other to check whether the
execution results are identical. If a fault occurs on one side,
the requested operation is continuously performed on
the remaining side.

This scheme has the following disadvantage. That
is, since at least two types of computers must be
prepared, the computer system is large and costly.

The following idea might be derived from the
second scheme. That is, the computer 100 and the
computer 200 may be implemented by one computer by
using a virtual computer technology. However, this idea
does not work well for the following reason.

The scheme disclosed in Sequoia is based on the
following assumption. Since the independent computers
100 and 200 are used, even if the data processing of
the computer 100 is restarted from a checkpoint due to
occurrence of a fault within the computer 100, the I/O
processing of the computer 200 is not influenced by the
fault.

However, if the computer 100 and the computer 200
were implemented on one computer by using the virtual
computer technology, the computer 100 and the
computer 200 would be simultaneously influenced by a fault
occurring in the base computer system.

As described above, a checkpoint recovery computer
system needs a special treatment of the I/O processing
portion. Either a method of arranging a special interface
between the CPU and the I/O device, or a method of
separately performing a calculating portion and an I/O
processing portion on two independent computers, is
employed. Therefore, the cost is considerably
increased.

It is an object of the present invention to provide an
I/O control apparatus capable of controlling an I/O
device on one computer having a checkpoint recovery
function without requiring a special interface circuit or
requiring two independent computers.

Another object is to provide a software layer
between an operating system kernel and an existing
device driver which restores the state of the I/O devices
when the computer system rolls back upon a fault.

According to the present invention, there is provided an
I/O control apparatus in a computer system which has one
or more CPUs, a main memory, and one or more I/O devices
and in which the CPUs periodically save the internal
state of the CPUs and the contents of the main memory
as a checkpoint and the internal state of the CPUs and
the contents of the main memory of the most recent
checkpoint are restored when a fault occurs in the
computer system to restart data processing, comprising: I/O
device state storing means for storing log data of state
setup of the I/O devices performed by the CPUs; and
I/O device state restoring means for restoring the state
of the I/O devices to that of the most recent checkpoint
by first initializing the I/O devices and second replaying
state setup according to the log data stored by the I/O
device state storing means.

According to this invention, when state setup such
as operation mode setup is performed by the CPU to an
I/O device, the log data of the state setup is stored in,
e.g., a main memory. Upon occurrence of a fault in the
computer system, an I/O device is initialized by an
initialize command or a reset signal assertion, and then
the state setup sequence is replayed for the I/O device
according to the log data, so that the state of the I/O
device is restored to the state of the most recent
checkpoint.

It is often the case that a part of the log data
becomes unnecessary because of new state setup and
therefore the unnecessary part can be eliminated. For
example, assume that an I/O device which has an initial
state of "state A" is set to "state B", and next set to "state
C", and then a checkpoint is acquired. In this case, upon
the setting up of "state C", the log data for "state B"
becomes unnecessary and can be eliminated. And
upon the checkpoint acquisition, all the log data other
than the "state C" setting up becomes unnecessary and
can be eliminated. Therefore, a means for eliminating
the unnecessary part of the log data is provided,
whereby the area required for the log data can be
saved. In addition, the time required for replaying the
state setup sequence after the I/O device initialization
can be reduced.
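A minimal Python sketch of such log elimination is given below. It assumes, beyond what the text states, that log entries are (device, parameter, value) tuples and that a later setup of the same parameter fully overrides an earlier one; the function name `prune` is hypothetical.

```python
def prune(log):
    """Keep only the newest entry per (device, parameter), preserving log order.

    Assumption: a later setup of the same parameter fully supersedes earlier ones.
    """
    latest = {}
    for index, (device, parameter, _value) in enumerate(log):
        latest[(device, parameter)] = index   # later entries overwrite earlier ones
    return [log[i] for i in sorted(latest.values())]


log = [
    ("rs232c", "mode", "state B"),
    ("rs232c", "mode", "state C"),   # supersedes the "state B" entry
    ("printer", "mode", "draft"),
]
pruned = prune(log)
```

Replaying `pruned` after initialization reaches the same device state as replaying `log`, but with fewer setup operations.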

For an I/O device to which new state setup has not
been performed since the preceding checkpoint,
initializing the I/O device and replaying the state setup
sequence need not be performed upon occurrence of a fault.
For this reason, means for skipping the initialization and
state setup sequence of such an I/O device is arranged.
The time required for recovery can thereby be further
shortened.

This invention further comprises request block
creating means for creating, when an application process
in the computer system makes an I/O request, a request
block in the main memory which contains information
necessary to perform the I/O request; I/O execution
processes for performing I/O operations by accessing
the I/O devices according to request blocks; and I/O
execution process initializing means for initializing, upon
restart from the most recent checkpoint which follows a
fault occurrence, the ongoing I/O execution processes
and causing the I/O operations being performed by the
I/O execution processes to be performed again from the
beginning.

According to the present invention, when an
application process makes an I/O request, a request block
which contains information necessary for the I/O
operation is created and executed by an I/O execution
process. The application process moves into a wait state until
the end of the I/O operation.

Assume a fault occurs in the computer system during
the I/O operation. While the state of the computer
system rolls back to the most recent checkpoint, the I/O
device state restoring means restores the state of the
I/O device. Upon restart from the most recent checkpoint,
the I/O execution process initializing means initializes
the I/O execution process responsible for the I/O
operation, and causes the I/O operation performed halfway
to be performed again from the beginning. In the
restart phase, an I/O execution process simply performs
the I/O operation according to the request block.

The fact that an I/O operation is performed by an
I/O execution process, not by the application process
itself, enables the I/O operation to be restarted from the
beginning. If the application process performed the I/O
operation in the conventional way, it would be difficult or
impossible to restart the I/O operation from the
beginning. It should be noted that at the most recent
checkpoint, the I/O operation may have been only
partially performed.

Of the request blocks stored in the main memory,
request blocks which were created before the most
recent checkpoint should be processed by I/O execution
processes, and the execution of request blocks which
were created after the most recent checkpoint should
be postponed until the next checkpoint.

Generally, if a rollback occurs, the second-time
data processing from the most recent checkpoint is not
always the same as the first-time data processing
because of the real time clock and asynchronous events
(e.g., external interrupts). Therefore, an I/O request made
by the data processing before fault occurrence may not
be made or may be made differently by the second-time
data processing after the fault recovery. Therefore, it is
necessary to postpone the execution of request blocks
created after the most recent checkpoint until a new
checkpoint is acquired.
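The postponement rule can be sketched as a two-stage queue. The Python below is a simplified model, not the described implementation: `RequestQueue` and its method names are hypothetical, and clearing the pending list on rollback stands in for the fact that request blocks held in main memory disappear when memory is restored to the checkpoint.

```python
class RequestQueue:
    """Request blocks become executable only after the next checkpoint (sketch)."""

    def __init__(self):
        self.executable = []   # created before the most recent checkpoint
        self.pending = []      # created after it; execution is postponed

    def create(self, request):
        self.pending.append(request)

    def checkpoint(self):
        # Requests made before this checkpoint can no longer be rolled back.
        self.executable.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        # Re-execution may issue different requests, so postponed ones vanish.
        self.pending.clear()


queue = RequestQueue()
queue.create("read sector 7")
queue.checkpoint()
queue.create("write sector 9")   # not yet executable
queue.rollback()                 # fault: the postponed request is forgotten
```

Because no postponed request has touched a device, forgetting it on rollback leaves no externally visible side effect to undo.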
When a checkpoint has been acquired, many
request blocks turn executable. Therefore it is efficient
to allocate, upon a checkpoint acquisition, many CPUs
to I/O execution processes so that I/O operations are
performed with small delay.

Making the CPU which has executed the application
process which requires an I/O operation also perform
the I/O execution process which is responsible for
the I/O operation leads to an increase in the cache hit ratio.

Properly determining the number of CPUs allocated to I/O
execution processes, depending on the
number of request blocks to be processed, improves the
system performance.

Assume that while an I/O execution process is
executing a device driver routine which outputs a character
string to a printer unit, a fault occurs in the computer
system. Then, only a part of the characters may be
printed out on the paper and cannot be erased. In this
case, the application process which made the I/O
request should receive an error reply so that the
application process may do an error recovery at application
level, as for a printer jam error.

Assume that an I/O execution process completes
the execution of a device driver routine which outputs a
character string to a printer unit and then a fault occurs
in the computer system. In this case, the whole character
string has been printed out. If the request block were
executed again after the fault recovery, it would result
in duplicated print. Therefore, in case that the I/O request
is an output request and has been completed before the
fault occurs, it is desirable that the application process
which made the I/O request, even when a fault occurs in
the computer system, receives a successful I/O completion
reply without re-executing the I/O request.
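The two printer cases above amount to a small decision rule. The following Python sketch is one possible reading of that rule; the `Progress` enumeration, the `redoable` parameter, and the function `reply_after_fault` are hypothetical, and the treatment of cases the text does not discuss (for example a completed input request) is an assumption.

```python
from enum import Enum


class Progress(Enum):
    NOT_STARTED = "not started"
    PARTIAL = "partially performed"
    COMPLETED = "completed"


def reply_after_fault(progress, is_output, redoable):
    """Decide what the requesting application process sees after recovery.

    `redoable` (an assumed notion) is True when repeating the operation from
    the beginning has no visible side effect, e.g. a disk read.
    """
    if progress is Progress.COMPLETED and is_output:
        return "success"      # do not re-execute: avoids duplicated output
    if progress is Progress.PARTIAL and not redoable:
        return "error"        # e.g. part of a string is already on paper
    return "re-execute"       # assumed default: perform again from the start


printed_fully = reply_after_fault(Progress.COMPLETED, is_output=True, redoable=False)
printed_partly = reply_after_fault(Progress.PARTIAL, is_output=True, redoable=False)
disk_read = reply_after_fault(Progress.PARTIAL, is_output=False, redoable=True)
```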

This invention can be more fully understood from
the following detailed description when taken in
conjunction with the accompanying drawings, in which:

FIG. 1 is a view showing the arrangement of a computer
system using a conventional checkpoint
restart scheme;

FIG. 2 is a timing chart in an I/O process of the
computer system shown in FIG. 1;

FIG. 3 is a view showing an arrangement in which
I/O control is implemented by a computer system
using a conventional checkpoint restart scheme;

FIG. 4 is a schematic view showing the arrangement
of a computer system according to the first
embodiment of the present invention;

FIG. 5 is a flow diagram of a config routine in the
first embodiment;

FIG. 6 is a flow diagram of a checkpoint acquisition
in the first embodiment;

FIG. 7 is a flow diagram of a fault recovery in the
first embodiment;

FIG. 8 is a flow diagram of a config routine in the
first embodiment;

FIG. 9 is a flow diagram of a checkpoint acquisition
in the first embodiment;

EP 0 788 052 A1

FIG. 10 is a flow diagram of a fault recovery in the
first embodiment;

FIG. 11 is a flow diagram of an application process
which makes an I/O request in the second
embodiment of the present invention;

FIG. 12 is a flow diagram of an I/O execution process
in the second embodiment;

FIGS. 13A through 13D show how I/O operations
are performed by application process and I/O
execution process in the second embodiment;

FIG. 14 is a flow diagram of a checkpoint acquisition
in the second embodiment;

FIG. 15 is a flow diagram of a fault recovery in the
second embodiment;

FIG. 16 is a flow diagram of an application process
which makes an I/O request in the second
embodiment;

FIG. 17 is a flow diagram of an I/O execution process
in the second embodiment;

FIG. 18 is a flow diagram of a checkpoint acquisition
in the second embodiment;

FIGS. 19A through 19E show how I/O operations
are performed with delay by the application
processes and the I/O execution processes in the
second embodiment;

FIG. 20 is a flow diagram of a fault recovery in the
second embodiment;

FIG. 21 is a flow diagram of an application process
which makes an I/O request in the third
embodiment of the present invention;

FIG. 22 is a flow diagram of an I/O execution process
in the third embodiment;

FIG. 23 is a flow diagram of a checkpoint acquisition
in the third embodiment;

FIG. 24 is a flow diagram of a fault recovery in the
third embodiment;

FIG. 25 is a flow diagram of an application process
which makes an I/O request in the third
embodiment; and

FIG. 26 is a flow diagram of an I/O execution process
in the third embodiment.

Embodiments of the present invention will be
described below with reference to the accompanying
drawings.

(First Embodiment) 

The first embodiment of the present invention will
be described below with reference to FIG. 4. FIG. 4 is a
schematic view showing a computer system according
to the first embodiment.

As shown in FIG. 4, the computer system of this
embodiment comprises CPUs 1a and 1b, a memory
state recovery unit 2, a main memory 3, and I/O devices
4a and 4b such as a printer and an RS232C controller.

When the content of the main memory 3 is updated
by the CPU 1a or 1b, the memory state recovery unit 2
holds the before image to restore the contents of the
main memory 3. Details of the memory state recovery
unit 2 are described in C. Kubiak et al., PENELOPE: A
RECOVERY MECHANISM FOR TRANSIENT HARDWARE
FAILURES AND SOFTWARE ERRORS, FTCS,
1982. The context of an application process including a
stack area and data area is stored in the main memory
3 as context information 31. Here, an application
process means a process of a conventional computer
system.

The operating system 33, more specifically the
printer device driver and the RS232C device driver, sets up
an operation mode such as a baud rate, a stop bit, a parity,
and the like when the system is initialized or when an
application process requests it. The operation mode that
is set up is stored in a state setting storage area 34 as
log data.

For example, in a typical UNIX operating system,
the state setup sequence for an I/O device such as an
RS232C controller is performed by a device driver
routine named xxconfig, the interface of which is common
to all the device drivers. Therefore, the parameters of
the config routine are preferably stored in the state setting
storage area 34 of the main memory 3 at the entry of the
config routine, so that the parameters of the config routine
can be recorded in the same way regardless of the
I/O device type.

FIG. 5 shows a flow diagram of a config routine for
each I/O device.

(1) Store parameters of the config routine in the
main memory as a state setting up value (step A1).

(2) Set up the state of the I/O device (step A2).

FIG. 6 shows a flow diagram of a checkpoint acqui- 
sition. 

(1) Save the internal state of the CPU, i.e., the
contents of registers and the cached data, into the main
memory (step B1).

(2) Clear data held in the memory state recovery
unit (step B2).

FIG. 7 shows a flow diagram when the set up
sequence is replayed from the most recent checkpoint
upon occurrence of a fault.

(1) Initialize the I/O device by a reset command or
reset signal assertion (step C1).

(2) Restore the state of the main memory to the
state of the most recent checkpoint by using the
memory state recovery unit (step C2). As a result,
the log of state setting up to the I/O device is
restored to the state of the most recent checkpoint.

(3) Execute the config routines again using the
config parameters stored in the main memory (step
C3). This re-execution is performed from the oldest
to the newest. As a result, the state of the I/O device
is recovered to the state of the checkpoint.

(4) Restart data processing which was being
performed at the checkpoint (step C4).

Therefore, data processing is restarted in a state
wherein the state of each I/O device is restored to the
state of the checkpoint. This means that the checkpoint
recovery mechanism and I/O device restoring
mechanism are realized in a single computer.
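The flows of FIGS. 5 to 7 can be summarized in a short Python sketch. It is a simplified model under stated assumptions: `Device`, `System`, and their methods are hypothetical names, and the attribute `log_at_checkpoint` stands in for what the memory state recovery unit would bring back when main memory is restored.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.state = {}          # operation mode currently set in the device

    def reset(self):
        self.state = {}          # reset command or reset signal assertion


class System:
    def __init__(self, devices):
        self.devices = {d.name: d for d in devices}
        self.setup_log = []          # config parameters, held in "main memory"
        self.log_at_checkpoint = []  # assumed result of restoring main memory

    def config(self, name, **params):
        self.setup_log.append((name, dict(params)))   # FIG. 5 (1): log at entry
        self.devices[name].state.update(params)       # FIG. 5 (2): set up device

    def checkpoint(self):
        # FIG. 6: after this point the current log is the checkpoint's log.
        self.log_at_checkpoint = [(n, dict(p)) for n, p in self.setup_log]

    def recover(self):
        for device in self.devices.values():          # FIG. 7 (1): initialize
            device.reset()
        # FIG. 7 (2): memory restore also rolls the setup log back.
        self.setup_log = [(n, dict(p)) for n, p in self.log_at_checkpoint]
        for name, params in self.setup_log:           # FIG. 7 (3): oldest first
            self.devices[name].state.update(params)
        # FIG. 7 (4): data processing would restart here.


rs232c = Device("rs232c")
system = System([rs232c])
system.config("rs232c", baud=9600, parity="none")
system.checkpoint()
system.config("rs232c", baud=19200)   # performed after the checkpoint
system.recover()                      # device is back at 9600 baud, no parity
```

Because the log itself lives in main memory, the setup performed after the checkpoint drops out of the log when memory is restored, so replay reproduces exactly the checkpoint-time device state.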

For an RS232C controller, when operation modes
such as a baud rate, a stop bit, and a parity are newly
set, the old setting up values become unnecessary.
Therefore, it is sufficient that only the latest values of the
operation modes of the RS232C controller 4b are held in the
state setting storage area 34 of the main memory 3
(unnecessary log data is discarded). The fault
recovery time is thereby reduced.

A state holding area 36 is effectively arranged in the
main memory 3 to manage the state setting up flags of the
I/O devices 4a and 4b. The state setting up flags are
managed in the following manner. The state setting up
flag ON indicates that some state setting up sequence
has been performed or is being performed to the I/O
device since the most recent checkpoint, and the state
setting up flag OFF indicates that no state setting has
been performed to the I/O device since the most recent
checkpoint.

FIG. 8 shows a flow diagram of a config routine in
case the above state setting up flags are employed.

(1) Store parameters of the config routine into the
main memory (step D1). Turn on the state setting
up flag of the I/O device.

(2) Set up the state of the I/O device (step D2).

FIG. 9 shows a flow diagram of a checkpoint acqui- 
sition in this case. 

(1) Save the internal state of the CPU into the main
memory (step E1).

(2) Turn off the state setting up flag of each I/O 
device (step E2). 

(3) Clear data held in the memory state recovery 
unit (step E3). 

FIG. 10 shows a flow diagram for fault recovery in 
this case. 

(1) If the state setting up flag of a certain I/O device
is ON, initialize the I/O device since it implies that a
new state has been set up since the most recent
checkpoint (step F1). On the other hand, an I/O
device with state setting up flag OFF does not need
to be initialized.

(2) Restore the state of the main memory to the
state of the most recent checkpoint by using the
memory state recovery unit (step F2). As a result,
the log of state setting up to the I/O device is
recovered to the state of the most recent checkpoint.

(3) With respect to only an I/O device which has
been initialized in step F1, execute the config
routines again by using the config parameters stored in
the main memory (step F3). This re-execution is
performed from the oldest to the newest. As a
result, the state of the I/O device is recovered to the
state of the most recent checkpoint.

(4) Restart data processing which was being
performed at the most recent checkpoint (step F4).

In this manner, the fault recovery can skip
initializing an I/O device with the state setting up flag OFF,
which results in faster fault recovery.
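The flag-based skipping can be illustrated with a short Python function. This is a sketch under assumptions: devices are modeled as plain dictionaries, `log_at_checkpoint` stands for the setup log as restored by the memory state recovery unit, and the function name `recover_with_flags` is hypothetical.

```python
def recover_with_flags(devices, setup_flags, log_at_checkpoint):
    """Initialize and replay only devices whose state setting up flag is ON.

    devices: name -> dict of current device settings
    setup_flags: name -> bool, ON (True) if set up since the last checkpoint
    log_at_checkpoint: list of (name, params), oldest first
    """
    initialized = [name for name, on in setup_flags.items() if on]
    for name in initialized:
        devices[name].clear()                    # initialize the device
    for name, params in log_at_checkpoint:       # replay oldest to newest
        if name in initialized:
            devices[name].update(params)
    return initialized


devices = {"printer": {"mode": "draft"}, "rs232c": {"baud": 19200}}
flags = {"printer": False, "rs232c": True}
log = [("printer", {"mode": "draft"}), ("rs232c", {"baud": 9600})]
touched = recover_with_flags(devices, flags, log)
```

In this example only the RS232C controller is reset and replayed; the printer, whose flag is OFF, is left untouched.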

(Second Embodiment) 

It is assumed that a computer system according to
this embodiment comprises, in addition to the
arrangement of the computer system described in the first
embodiment, a request block storage area 35 and I/O
execution processes (FIG. 4).

In this embodiment, when an application process
makes a system call to request an I/O operation of an
I/O device, the operating system, instead of calling the
device driver routine in the context of the caller process,
calls a request block create routine. The request block
create routine creates a request block having the entry
address of the device driver routine and the parameters.
Here, the application (caller) process moves to a wait
state. The request block is simply held in the main
memory until a new checkpoint is acquired. Therefore, if
there are a lot of application processes in the computer
system, the number of the request blocks held in the
main memory would increase as time goes on.

There are a certain number of I/O execution
processes in the system. An I/O execution process is a
special process to execute device driver routines according
to a request block.

When a new checkpoint has been acquired, the
request blocks become ready to be processed. An I/O
execution process in the initial state is allocated to one of
the request blocks. The I/O execution process executes
the device driver routine with the appropriate
parameters, both of which are designated by the request block.
Therefore, the number of I/O operations being
performed concurrently depends on the number of I/O
execution processes. The I/O execution process moves into
a wait state when it invokes the I/O device within the
designated device driver routine. When the I/O device
returns a termination interrupt, the interrupt handling
routine of the device driver is called and the result is
reflected to the I/O execution process context. Then, the
I/O execution process turns ready. At the end of the I/O
operation, the I/O execution process reports the result
to the application process via the request block.

If a fault occurs in the computer system during DMA
transfer from an I/O device to the main memory, it is
necessary to stop the DMA transfer by initializing the I/O
device before the main memory is restored. For this
purpose, an in-operation flag is employed for each I/O device
with DMA capability in order to determine whether each
I/O device must be initialized or not. An in-operation flag
is controlled to be ON only while the corresponding I/O
device performs DMA transfer.

FIG. 11 shows an I/O request flow diagram
performed by an application process in this embodiment.

(1) Store parameters relating to an I/O operation in
the main memory as a request block (step G1).

(2) Make transition to a wait state until the I/O
operation described in the request block is completed
(step G2).

(3) When the application process resumes its
execution, perform a completion step at the application
process side relating to the I/O request with
reference to a result code field of the request block and
then execute a succeeding step (step G3).

FIG. 12 shows a flow diagram of an I/O execution 
process. Here, it is assumed that a multiplicity of I/O 
execution processes are executed concurrently. 

(1) Wait for an executable request block (initial
state, step H1).

(2) Set up registers of the I/O device and turn on the
in-operation flag in accordance with the request
block, thereby starting the I/O device (step H2).

(3) Upon receiving a completion interrupt from the I/O
device, turn off the in-operation flag, perform a
completion step of the I/O request, write a result code
in the request block, and put the application
process which has been in the wait state into the ready state
(step H3).
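The cooperation between request blocks and an I/O execution process can be sketched as follows. The Python below collapses the asynchronous flow into a synchronous loop for illustration only: `RequestBlock`, `io_execution_process`, and `fake_disk_read` are hypothetical names, the driver call returns immediately instead of waiting for an interrupt, and setting `done` stands in for waking the waiting application process.

```python
from collections import deque


class RequestBlock:
    def __init__(self, driver_routine, params):
        self.driver_routine = driver_routine   # entry of the device driver routine
        self.params = params                   # parameters for that routine
        self.result = None                     # result code field
        self.done = False                      # True once the requester may resume


def io_execution_process(queue, in_operation):
    """Process every executable request block, one at a time (simplified)."""
    while queue:                               # wait for an executable block
        block = queue.popleft()
        device = block.params["device"]
        in_operation[device] = True            # start the I/O device
        block.result = block.driver_routine(**block.params)
        in_operation[device] = False           # completion interrupt received
        block.done = True                      # requester goes from wait to ready


def fake_disk_read(device, sector):
    # Stand-in for a real device driver routine.
    return f"{device}: data of sector {sector}"


queue = deque()
in_operation = {"disk0": False}
block = RequestBlock(fake_disk_read, {"device": "disk0", "sector": 7})
queue.append(block)                            # the application creates the block
io_execution_process(queue, in_operation)
```

Since the request block carries everything needed to perform the operation, a fresh I/O execution process can redo it from the beginning without involving the application process.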

FIGS. 13A through 13D show a sample sequence 
of I/O operations performed by two application proc- 
esses and two I/O execution processes. 

When an I/O request is made by an application
process, a corresponding request block is created and
stored in the memory (FIGS. 13A and 13B). After the
request block is created, an I/O execution process
(register setting up, starting, and completion interrupt
processing of the I/O device) is executed under the I/O
execution process context (FIGS. 13C and 13D).

FIG. 14 shows a flow diagram of checkpoint
acquisition in this embodiment.

(1) Save the internal state of the CPU into the main
memory (step I1).

(2) Turn off the state setting up flag of each I/O
device (step I2).

(3) Clear data held in the memory state recovery
unit (step I3).

FIG. 15 shows a flow diagram of a fault recovery 
upon occurrence of a fault in this embodiment. 

(1) If the corresponding state setting up flag of a
certain I/O device is ON or the in-operation flag is
ON, initialize the I/O device. Turn off the state
setting up flag and the in-operation flag (step J1).

(2) Restore the state of the main memory to the 
state of the most recent checkpoint by using the 
memory state recovery unit (step J2). 

(3) For only the I/O devices which have been initialized
at step J1, replay the state setup sequence from
the oldest to the newest with reference to the log of
the state setting up values held in the main memory
(step J3). In this manner, the state of the I/O device
is recovered to the state of the most recent checkpoint.

(4) Initialize the I/O execution processes. More
specifically, the I/O execution processes are set to step H1
irrespective of the state of the I/O execution process
at the most recent checkpoint (step J4).

(5) Restart data processing which was being per- 
formed at the most recent checkpoint (step J5). 
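
Steps J1 to J4 can be sketched as follows. This is a minimal model, not the patented implementation: the names `IODevice`, `recover`, `restore_memory` and `reset_io_processes` are hypothetical, device registers are modelled as a dictionary, and `restore_memory` is assumed to return the state-setting log as it stood at the most recent checkpoint, since that log is held in the main memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

SetupLog = Dict[str, List[Tuple[str, int]]]  # device name -> [(register, value)], oldest first


@dataclass
class IODevice:
    registers: Dict[str, int] = field(default_factory=dict)
    state_setting_up: bool = False   # ON if state was set since the last checkpoint
    in_operation: bool = False       # ON while an I/O operation is running

    def initialize(self) -> None:
        # Stand-in for a hardware reset of the device.
        self.registers.clear()


def recover(devices: Dict[str, IODevice],
            restore_memory: Callable[[], SetupLog],
            reset_io_processes: Callable[[], None]) -> List[str]:
    # Step J1: initialize only devices whose state may differ from the checkpoint.
    initialized = []
    for name, dev in devices.items():
        if dev.state_setting_up or dev.in_operation:
            dev.initialize()
            initialized.append(name)
        dev.state_setting_up = False
        dev.in_operation = False

    # Step J2: roll the main memory (including the log) back to the checkpoint.
    setup_log = restore_memory()

    # Step J3: replay the logged state settings, oldest to newest.
    for name in initialized:
        for register, value in setup_log.get(name, []):
            devices[name].registers[register] = value

    # Step J4: put every I/O execution process back to its initial step.
    reset_io_processes()
    return initialized
```

Step J5 (restarting data processing) is left to the caller. A device whose two flags are both OFF is skipped entirely, so its registers are never disturbed.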

It is important that control of an I/O device is performed
in the context of an I/O execution process, which
is different from an application process. Because of this,
initialization at step J4 can be performed without any
influence on the application process which requested the
I/O operation. In the prior art, since an I/O operation is
performed in the application process context, the above
initialization would need a much more complicated or ad
hoc process.

It is effective to add an execution permission flag to
each request block. This execution permission flag is
controlled such that it remains OFF until a new
checkpoint is acquired, and is turned ON when a new
checkpoint has been acquired.

FIG. 16 shows a flow diagram performed by an 
application process when the execution permission flag 
is added. 

(1) Store parameters relating to a unit I/O operation
into the main memory as a request block (step K1).
Turn off the execution permission flag of the request
block.

(2) Make transition to wait state until the I/O opera- 
tion designated in the request block is completed 
(step K2). 

(3) When the application process resumes its execution,
perform a completion step on the application process
side relating to the I/O request with reference to the
result code field of the request block, and then
execute the succeeding step (step K3).

FIG. 17 shows a flow diagram of an I/O execution
process. In this case, it is assumed that a multiplicity of
I/O execution processes are executed concurrently.

(1) Wait for a request block whose execution permission
flag is ON (step L1).

(2) Set up registers of the I/O device and turn on the
in-operation flag in accordance with the request
block, thereby starting the I/O device (step L2).

(3) Upon receiving a completion interrupt from the I/O
device, turn off the in-operation flag, perform a
completion step of the I/O request, write the result
code in the request block, and put the application
process which has been in the wait state into the ready
state (step L3).


FIG. 18 shows a flow diagram of a checkpoint 
acquisition. 

(1) Save the internal state of the CPU into the main
memory (step M1).

(2) Turn off the state setting up flag of each I/O
device (step M2). Turn on the execution permission
flag of each request block stored in the main memory.

(3) Clear data held in the memory state recovery
unit (step M3).

FIGS. 19A through 19E show how the I/O requests
are performed. Here, two application processes, a
checkpoint acquisition, and two I/O execution processes
are involved.

When an I/O request is made by an application
process, a corresponding request block is created and
stored in the memory (FIGS. 19A and 19B). Since the
execution permission flag of the request block remains
OFF until a checkpoint is acquired, the I/O execution
process stays in the idle state.

When a checkpoint has been acquired, the execution
permission flag of the request block is turned ON
(FIG. 19C). An I/O execution process takes a request
block whose execution permission flag is ON and
executes the I/O operation (setting up the registers of the
I/O device, starting the I/O device, and handling the
completion interrupt of the I/O device) (FIGS. 19D and 19E).
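
The gating effect of the execution permission flag can be sketched as follows. The names `RequestBlock`, `acquire_checkpoint` and `next_executable` are hypothetical, step M1 (saving the CPU state) is omitted, and the memory state recovery unit is modelled as a plain list of saved entries.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RequestBlock:
    params: dict
    execution_permitted: bool = False   # step K1: OFF when the block is created


def acquire_checkpoint(request_blocks: List[RequestBlock],
                       state_setting_flags: Dict[str, bool],
                       recovery_unit_data: list) -> None:
    """Steps M2 and M3 (step M1, saving the CPU state, is not modelled)."""
    for device in state_setting_flags:
        state_setting_flags[device] = False      # step M2, first half
    for rb in request_blocks:
        rb.execution_permitted = True            # step M2, second half
    recovery_unit_data.clear()                   # step M3


def next_executable(request_blocks: List[RequestBlock]) -> Optional[RequestBlock]:
    """Step L1: an I/O execution process only sees permitted blocks."""
    for rb in request_blocks:
        if rb.execution_permitted:
            return rb
    return None
```

A block created after a checkpoint is therefore invisible to the I/O execution processes until the next checkpoint, which is what keeps the I/O operation from starting before its request block is part of a checkpoint.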

FIG. 20 shows a flow diagram of a fault recovery
and re-execution of an I/O operation.

(1) If the state setting up flag of an I/O device is ON
or the in-operation flag is ON, initialize the I/O
device (step N1). Turn off the state setting up flag
and the in-operation flag.

(2) Restore the state of the main memory to the 
state of the most recent checkpoint by using the 
memory state recovery unit (step N2). 

(3) With respect to only the I/O devices which have
been initialized at step N1, state setting up is performed
again from the oldest to the newest with reference
to the log of the state setting up stored in the
main memory (step N3). The state of the I/O device
is recovered to the state of the most recent checkpoint.

(4) Initialize ongoing I/O execution processes (step
N4). More specifically, the I/O execution processes
are set to the state of step L1 irrespective of the
state of the I/O execution process at the most
recent checkpoint.

(5) Restart data processing which was being performed
at the most recent checkpoint (step N5).



Assume that a fault occurs in the middle of an I/O
operation. When the fault occurs, the in-operation flag of
the I/O device is ON and therefore the I/O device is
initialized. The contents of the request block are rolled back
to those of the most recent checkpoint by the memory
state recovery unit. Then, the state of the I/O device is
recovered by re-performing the setup sequence held in
the main memory (step N3). When the fault recovery
step has been completed, an I/O execution process takes
the request block and re-executes the I/O operation
according to the request block from the beginning.

This is how an I/O operation interrupted
halfway is re-performed after a fault recovery.
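
The rollback of a request block relies on the memory state recovery unit, which records the previous contents of each updated location and writes them back after a fault. A minimal sketch of that behaviour, with the main memory modelled as a dictionary and the class and method names chosen here purely for illustration, is:

```python
_ABSENT = object()  # marks an address that did not exist before the update


class MemoryStateRecoveryUnit:
    """Keeps before-images of memory updates made since the last checkpoint."""

    def __init__(self):
        self._before_images = []

    def write(self, memory: dict, address, value) -> None:
        # Record the old contents first, then perform the update.
        self._before_images.append((address, memory.get(address, _ABSENT)))
        memory[address] = value

    def clear(self) -> None:
        # Called at checkpoint acquisition: older before-images are no longer needed.
        self._before_images.clear()

    def restore(self, memory: dict) -> None:
        # Undo the updates newest first so the oldest before-image wins.
        while self._before_images:
            address, old = self._before_images.pop()
            if old is _ABSENT:
                memory.pop(address, None)
            else:
                memory[address] = old
```

After `restore`, a request block that had been partly processed looks exactly as it did at the checkpoint, so an I/O execution process can simply pick it up again from the beginning.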

When a checkpoint has been acquired, the request
blocks which have been created since the preceding
checkpoint become executable. Therefore, it is appropriate
to set the priority of the I/O execution processes higher
after a checkpoint acquisition, so that the delayed I/O
requests are executed immediately.

To keep the cache hit ratio high, the CPU which
executed an application process which made an I/O
request should be assigned to the I/O execution process
which is responsible for the request block created
by the I/O request.

A preferable embodiment is as follows.

A request block has a CPU identifier field. When the
request block is created, the identifier of the CPU which
executes the application process is written into the CPU
identifier field. An I/O execution process takes a request
block having the same CPU identifier as the CPU
which executes the I/O execution process.
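
The CPU identifier matching can be sketched as follows. The names `RequestBlock`, `create_request` and `take_request` are hypothetical, and CPU identifiers are assumed to be small integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestBlock:
    params: dict
    cpu_id: int   # CPU that ran the requesting application process


def create_request(params: dict, current_cpu: int) -> RequestBlock:
    # The identifier of the creating CPU is recorded in the block.
    return RequestBlock(params=params, cpu_id=current_cpu)


def take_request(blocks: List[RequestBlock],
                 current_cpu: int) -> Optional[RequestBlock]:
    # An I/O execution process takes only a block created on its own CPU,
    # so the request data is likely to be in that CPU's cache already.
    for index, rb in enumerate(blocks):
        if rb.cpu_id == current_cpu:
            return blocks.pop(index)
    return None
```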

The number of CPUs which are assigned to the I/O
execution processes should be determined according to
the number of executable request blocks and the
number of CPUs of the computer. If the number of
executable request blocks increases, more CPUs
should be assigned to the I/O execution processes. One
preferable embodiment is that the scheduler of the
computer determines whether an idle CPU is assigned
to an application process or an I/O execution process
depending on the number of executable request blocks.
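
The specification does not give a concrete assignment rule, so the following is only one possible policy, stated as an assumption: aim for one I/O CPU per `blocks_per_io_cpu` executable request blocks, capped at the number of CPUs in the computer. The function name and the parameter `blocks_per_io_cpu` are hypothetical.

```python
def assign_idle_cpu(executable_blocks: int, io_cpus: int, total_cpus: int,
                    blocks_per_io_cpu: int = 4) -> str:
    """Decide what an idle CPU should run next.

    executable_blocks: request blocks whose execution permission flag is ON
    io_cpus:           CPUs currently running I/O execution processes
    total_cpus:        CPUs in the computer
    """
    # Ceiling division: how many I/O CPUs the current backlog calls for.
    wanted = -(-executable_blocks // blocks_per_io_cpu)
    wanted = min(total_cpus, wanted)
    return "io" if io_cpus < wanted else "application"
```

With this rule the share of CPUs given to I/O execution processes grows with the backlog of executable request blocks and falls back to zero when there is none.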

(Third Embodiment)

A computer system according to this embodiment
comprises, in addition to the arrangement of the
computer system described in the second embodiment, a state
holding area 36.

In this embodiment, attention is focused on an I/O
device such as a printer. For a printer, when a fault
occurs in the computer while a slip is being printed, the
slip may be left in an incomplete state (i.e., the printer
can neither be restored to the state of the most recent
checkpoint nor finish printing the slip completely).

In order to detect such a state, an execution
interruption error flag is added to a request block in this
embodiment. This execution interruption error flag is
used to identify whether the I/O operation designated by
a request block results in an unrecoverable I/O error
because of a fault occurrence in the computer system.

FIG. 21 shows a flow diagram performed by an
application process in this embodiment.

(1) Store parameters relating to an I/O operation in
the main memory as a request block (step O1).
Turn off the execution permission flag of the request
block, and turn off the execution interruption error
flag of the request block.

(2) Make transition to the wait state until the I/O
operation designated in the request block is completed
(step O2).

(3) When the application process resumes its execution,
perform a completion step on the application process
side relating to the I/O request with reference to the
result code field of the request block, and then
execute the succeeding step (step O3).

FIG. 22 shows a flow diagram of an I/O execution
process.

(1) Wait for a request block whose execution permission
flag is ON (step P1).

(2) If the execution interruption error flag of the
request block is ON, set an error code in the result
code field of the request block, and set the application
process which has been in the wait state to the ready
state (steps P2, P5).

(3) Otherwise, set up registers of an I/O device and
turn on the in-operation flag in accordance with the
request block (step P3). Turn on the execution
interruption error flag of the request block if the I/O
device is a printer.

(4) Upon receiving a completion interrupt from the I/O
device, turn off the in-operation flag, perform a
completion step of the I/O request, write the result code
in the request block, and put the application process
which has been in the wait state into the ready state
(step P4). Turn off the execution interruption error
flag of the request block if the I/O device is a printer.

FIG. 23 shows a flow diagram of a checkpoint
acquisition in this embodiment.

(1) Save the internal state of the CPU into the main
memory (step Q1).

(2) Turn off the state setting up flag of each I/O
device (step Q2). Turn on the execution permission
flag of each request block stored in the main memory.

(3) Clear data held in the memory state recovery
unit (step Q3).

FIG. 24 shows a flow diagram of a fault recovery
upon occurrence of a fault in this embodiment.

(1) If the state setting up flag of an I/O device is ON
or the in-operation flag is ON, initialize the I/O
device. Turn off the state setting up flag
and the in-operation flag (step R1).

(2) Restore the state of the main memory to the
state of the most recent checkpoint by using the
memory state recovery unit (step R2).

With respect to the execution interruption error
flag of the request block corresponding to a printer,
the value of the flag must remain unchanged through
the restoration of the main memory. This is realized,
for instance, in the following way. An ordinary
computer system has an NVRAM (nonvolatile
memory) for holding system parameters, and data
updates in the NVRAM can be controlled such that
the state is not restored by the memory state recovery
unit. Therefore, when a fault occurs, the value of the
flag is preserved by saving the value of the execution
interruption error flag in the NVRAM before the main
memory restoration, and writing back the saved value
into the flag after the main memory restoration.

(3) With respect to only an I/O device which has
been initialized at step R1, the state setup sequence is
re-performed from the oldest to the newest with reference
to the log of the state setting up held in the
main memory (step R3). In this manner, the state of
the I/O device is recovered to the state of the most
recent checkpoint.

(4) Initialize ongoing I/O execution processes (step
R4). More specifically, the I/O execution processes
are set to the state of step P1 irrespective of the
state of the I/O execution process at the most
recent checkpoint.

(5) Restart data processing which was being performed
at the most recent checkpoint (step R5).

When a fault occurs in the middle of a printer I/O
operation according to a request block, the execution
interruption error flag is ON and it remains unchanged
through the main memory restoration. Then an I/O
execution process tries to re-execute the request block and
it finds that the execution interruption error flag is ON (at
step P2). The I/O execution process, instead of
re-executing the I/O operation, sets an error code in the result
code field of the request block, and sets the application
process into the ready state.

In this manner, when a printer I/O operation is
interrupted halfway because of a fault occurrence, the
printer I/O operation is not repeated, but the
application process copes with the error in the same way
as with a printer jam error.
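
The way the execution interruption error flag survives the memory restoration of step R2 can be sketched as follows. The main memory, the checkpoint image and the NVRAM are all modelled as dictionaries, and the function name `restore_preserving` and the key naming are hypothetical.

```python
from typing import Iterable


def restore_preserving(memory: dict, checkpoint: dict,
                       preserved_keys: Iterable[str], nvram: dict) -> None:
    """Restore memory to the checkpoint while keeping selected flags as they are."""
    preserved_keys = list(preserved_keys)

    # Save the current flag values in NVRAM, which the restoration does not touch.
    for key in preserved_keys:
        if key in memory:
            nvram[key] = memory[key]

    # Roll the main memory back to the most recent checkpoint.
    memory.clear()
    memory.update(checkpoint)

    # Write the saved flag values back over the restored ones.
    for key in preserved_keys:
        if key in nvram:
            memory[key] = nvram[key]
```

Everything except the listed flags returns to its checkpoint value, so the I/O execution process that later reaches step P2 still sees the flag as ON and reports an error instead of printing the slip a second time.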

In the case of a printer, it is more suitable to have the
execution interruption error flag as a ternary flag, i.e.,
completion/in-execution/non-execution.

FIG. 25 shows a flow diagram performed by an 
application process in this case. 

(1) Store parameters with respect to an I/O operation
into the main memory as a request block (step
S1). Turn off the execution permission flag of the
request block, and set the execution interruption
error flag of the request block to non-execution.

(2) Make transition to the wait state until the I/O
operation described in the request block is completed
(step S2).

(3) When the application process resumes its execution,
perform a completion step on the application process
side relating to the I/O request with reference to the
result code field of the request block, and then
execute the succeeding step (step S3).

FIG. 26 shows a flow diagram of an I/O execution
process.

(1) Wait for a request block whose execution per- 
mission flag is ON (step T1). 

(2) If the execution interruption error flag of the
request block is in-execution, set an error code in
the result code field of the request block, and set
the application process which has been in the wait
state to the ready state (steps T2, T6).

(3) Otherwise, if the execution interruption error flag
is completion, set a completion code in the result
code field of the request block, and set the application
process which has been in the wait state to the ready
state (steps T3, T7).

(4) Otherwise, set up registers of an I/O device and
turn on the in-operation flag in accordance with the
request block (step T4).

(5) Upon receiving a completion interrupt from the I/O
device, turn off the in-operation flag, set the execution
interruption error flag to completion, and set the
application process which has been in the wait state to
the ready state (step T5).

When a fault occurs after the end of a printer I/O
operation, the execution interruption error flag is
completion and it remains unchanged through the main
memory restoration. Then an I/O execution process
tries to re-execute the request block and it finds that the
execution interruption error flag shows completion (at step
T3). The I/O execution process, instead of re-executing
the I/O operation, sets a completion code in the result
code field of the request block, and sets the application
process into the ready state.

In this manner, when a printer I/O operation has
been completed before the occurrence of a fault in the
computer, the printer I/O operation is not repeated,
but the application process receives a result
code.
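
The three-way decision of steps T2 to T5 can be sketched as follows. The names `ExecState`, `dispatch`, `complete`, `ERROR_CODE` and `COMPLETION_CODE` and the code values are hypothetical. The sketch also assumes, by analogy with step P3, that the flag is set to in-execution when the device is started; the flow of FIG. 26 as described does not state this explicitly.

```python
from enum import Enum
from typing import Callable


class ExecState(Enum):
    NON_EXECUTION = "non-execution"
    IN_EXECUTION = "in-execution"
    COMPLETION = "completion"


ERROR_CODE = -1        # hypothetical value for "interrupted by a fault"
COMPLETION_CODE = 0    # hypothetical value for "already completed"


def dispatch(rb: dict, start_device: Callable[[dict], None]) -> str:
    """Handle one executable request block (steps T2 to T4)."""
    if rb["exec_state"] is ExecState.IN_EXECUTION:
        # Steps T2, T6: the operation was cut off by a fault; do not repeat it.
        rb["result_code"] = ERROR_CODE
        return "error"
    if rb["exec_state"] is ExecState.COMPLETION:
        # Steps T3, T7: the operation finished before the fault; do not repeat it.
        rb["result_code"] = COMPLETION_CODE
        return "completed"
    # Step T4: first execution of this block.
    start_device(rb)
    rb["exec_state"] = ExecState.IN_EXECUTION   # assumption, by analogy with step P3
    return "started"


def complete(rb: dict) -> None:
    """Step T5: called on the completion interrupt."""
    rb["exec_state"] = ExecState.COMPLETION
```

In both non-initial states the device is never started again, which is the property that prevents a slip from being printed twice.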

Claims

1. An I/O control apparatus in a computer system
which has one or more CPUs (1a, 1b), a main
memory (3), and one or more I/O devices (4a, 4b)
and characterized in that said CPUs periodically
save the internal state of said CPUs and the contents
of said main memory as a checkpoint, and the
internal state of said CPUs and the contents of said
main memory of the most recent checkpoint are
restored when a fault occurs in said computer system
to restart data processing, comprising:

I/O device state storing means (34, A1, A2, 1a)
for storing log data of state setting of said I/O
devices performed by said CPUs; and
I/O device state restoring means (1a, C1, C2,
C3) for restoring the state of said I/O devices to
that of the most recent checkpoint by first
initializing said I/O devices and second replaying
the state setting up sequence according to said log
data stored by said I/O device state storing
means.

2. An apparatus according to claim 1, characterized in
that said storing means includes means (1a, 02) for
erasing part of the existing log data which is made
unnecessary by setting up a new state.

3. An apparatus according to claim 1, characterized in
that said I/O device state restoring means includes
means (1a, F1) for skipping the initializing and the
replaying of the state setting up sequence for an I/O
device for which new state setting has not been
performed since the most recent checkpoint.

4. An apparatus according to claim 1, characterized
by further comprising:

request block creating means (35) for creating,
when an application process in said computer
system makes an I/O request, a request block
in said main memory which contains information
necessary to perform said I/O request;
I/O execution processes (32, FIG. 4, H1, H2, H3
in FIG. 12) for performing an I/O operation by
executing I/O device driver routines according to a
request block;
I/O execution process initializing means (J4
in FIG. 15) for initializing, upon restart from the
most recent checkpoint after a fault occurrence,
said I/O execution processes other than those
in the initial state and causing I/O operations being
performed by said I/O execution processes to
be performed again from the beginning.

5. An apparatus according to claim 2, characterized
by further comprising:

request block creating means (35) for creating,
when an application process in said computer
system makes an I/O request, a request block
in said main memory which contains information
necessary to perform said I/O request;
I/O execution processes (32, FIG. 4, H1, H2,
H3 in FIG. 12) for performing an I/O operation
by executing I/O device driver routines according
to a request block;
I/O execution process initializing means (J4 in
FIG. 15) for initializing, upon restart from the
most recent checkpoint after a fault occurrence,
said I/O execution processes other than those
in the initial state and causing I/O operations being
performed by said I/O execution processes to
be performed again from the beginning.

6. An apparatus according to claim 3, characterized
by further comprising:

request block creating means (35) for creating,
when an application process in said computer
system makes an I/O request, a request block
in said main memory which contains information
necessary to perform said I/O request;
I/O execution processes (32, FIG. 4, H1, H2,
H3 in FIG. 12) for performing an I/O operation
by executing I/O device driver routines according
to a request block;
I/O execution process initializing means (J4
in FIG. 15) for initializing, upon restart from the
most recent checkpoint after a fault occurrence,
said I/O execution processes other than those
in the initial state and causing I/O operations being
performed by said I/O execution processes to
be performed again from the beginning.

7. An apparatus according to claim 4, characterized in
that, of request blocks held in said main memory,
said I/O execution processes begin to perform an
I/O operation according to a request block created
before the most recent checkpoint while said I/O
execution processes postpone an I/O operation
according to a request block created after the most
recent checkpoint until a new checkpoint acquisition.

8. An apparatus according to claim 5, characterized in
that, of request blocks held in said main memory,
said I/O execution processes begin to perform an
I/O operation according to a request block created
before the most recent checkpoint while said I/O
execution processes postpone an I/O operation
according to a request block created after the most
recent checkpoint until a new checkpoint acquisition.

9. An apparatus according to claim 6, characterized in
that, of request blocks held in said main memory,
said I/O execution processes begin to perform an
I/O operation according to a request block created
before the most recent checkpoint while said I/O
execution processes postpone an I/O operation
according to a request block created after the most
recent checkpoint until a new checkpoint acquisition.

10. An apparatus according to claim 7, characterized in
that said CPUs are assigned to said I/O execution
processes when a new checkpoint acquisition has
been completed.

11. An apparatus according to claim 8, characterized in 
that said CPUs are assigned to said I/O execution 
processes when a new checkpoint acquisition has 
been completed. 

12. An apparatus according to claim 9, characterized in 
that said CPUs are assigned to said I/O execution 
processes when a new checkpoint acquisition has 
been completed. 

13. An apparatus according to claim 4, characterized in
that the CPU which executes an application process
which made an I/O request also executes an
I/O execution process which is responsible for the
request block created based on said I/O request.

14. An apparatus according to claim 5, characterized in
that the CPU which executes an application process
which made an I/O request also executes an
I/O execution process which is responsible for the
request block created based on said I/O request.

15. An apparatus according to claim 6, characterized in
that the CPU which executes an application process
which made an I/O request also executes an
I/O execution process which is responsible for the
request block created based on said I/O request.

16. An apparatus according to claim 4, characterized in
that the number of CPUs which are assigned to
said I/O execution processes is properly determined
depending on the number of request blocks
to be processed.

17. An apparatus according to claim 5, characterized in
that the number of CPUs which are assigned to
said I/O execution processes is properly determined
depending on the number of request blocks
to be processed.

18. An apparatus according to claim 6, characterized in
that the number of CPUs which are assigned to
said I/O execution processes is properly determined
depending on the number of request blocks
to be processed.

19. An apparatus according to claim 4, characterized
by further comprising means (P2 in FIG. 22) for
making, when a fault occurs, an error reply to the
application process without re-executing the
requested I/O operation, in case that said I/O
device state restoring means does not manage to
restore the state of the I/O device which relates to
said I/O request.

20. An apparatus according to claim 5, characterized
by further comprising means for making, when a
fault occurs, an error reply to the application process
without re-executing the requested I/O operation,
in case that said I/O device state restoring
means does not manage to restore the state of the
I/O device which relates to said I/O request.

21. An apparatus according to claim 6, characterized
by further comprising means for making, when a
fault occurs, an error reply to the application process
without re-executing the requested I/O operation,
in case that said I/O device state restoring
means does not manage to restore the state of the
I/O device which relates to said I/O request.

22. An apparatus according to claim 7, characterized
by further comprising means for making, when a
fault occurs, an error reply to the application process
without re-executing the requested I/O operation,
in case that said I/O device state restoring
means does not manage to restore the state of the
I/O device which relates to said I/O request.

23. An apparatus according to claim 8, characterized
by further comprising means for making, when a
fault occurs, an error reply to the application process
without re-executing the requested I/O operation,
in case that said I/O device state restoring
means does not manage to restore the state of the
I/O device which relates to said I/O request.

24. An apparatus according to claim 9, characterized
by further comprising means for making, when a
fault occurs, an error reply to the application process
without re-executing the requested I/O operation,
in case that said I/O device state restoring
means does not manage to restore the state of the
I/O device which relates to said I/O request.

25. An apparatus according to claim 4, characterized
by further comprising means for making, when a
fault occurs, a successful I/O completion reply to
the application process without re-executing the
requested I/O operation, in case that the I/O
request is an output request and the I/O request
has been completed before the fault occurrence.

26. An apparatus according to claim 5, characterized
by further comprising means for making, when a
fault occurs, a successful I/O completion reply to
the application process without re-executing the
requested I/O operation, in case that the I/O
request is an output request and the I/O request
has been completed before the fault occurrence.

27. An apparatus according to claim 6, characterized
by further comprising means for making, when a
fault occurs, a successful I/O completion reply to
the application process without re-executing the
requested I/O operation, in case that the I/O
request is an output request and the I/O request
has been completed before the fault occurrence.

28. An apparatus according to claim 7, characterized
by further comprising means for making, when a
fault occurs, a successful I/O completion reply to
the application process without re-executing the
requested I/O operation, in case that the I/O
request is an output request and the I/O request
has been completed before the fault occurrence.

29. An apparatus according to claim 8, characterized
by further comprising means for making, when a
fault of said computer system occurs, a successful
I/O completion reply to the I/O requesting process
without re-executing the requested I/O operation, in
case that the I/O request is an output request and
the I/O request has been completed before the fault
occurrence.

30. An apparatus according to claim 9, characterized
by further comprising means for making, when a
fault occurs, a successful I/O completion reply to
the application process without re-executing the
requested I/O operation, in case that the I/O
request is an output request and the I/O request
has been completed before the fault occurrence.

31. An I/O control method in a computer system which
has one or more CPUs, a main memory, and one or
more I/O devices and characterized in that said
CPUs periodically save the internal state of said
CPUs and the contents of said main memory as a
checkpoint, and the internal state of said CPUs and
the contents of said main memory of the most
recent checkpoint are restored when a fault occurs
in said computer system to restart data processing,
comprising:

storing (A1 in FIG. 5) log data of state setting of
said I/O devices performed by said CPUs; and
restoring (B1 in FIG. 6) the state of said I/O
devices to that of the most recent checkpoint by
first initializing said I/O devices and second
replaying the state setting up sequence according
to said stored log data.

32. An article of manufacture comprising:

a computer usable medium having computer
readable program code means embodied
therein for causing statuses of input and output
(I/O) units to be restored to respective checkpoints
when a computer system is restarted
from occurrence of a fault, the computer readable
program code means in said article of manufacture
comprising:

computer readable program code means for
causing a computer to save log data of statuses
upon setting statuses including an operation
mode with respect to the I/O devices, and
computer readable program code means for
causing a computer to initialize the I/O devices
and set states of the I/O devices in accordance
with the saved log data, upon occurrence of a
fault in the computer system.

33. An article of manufacture comprising:

a computer usable medium having computer
readable program code means embodied
therein for causing statuses of input and output
(I/O) units to be recovered to respective checkpoints
when a computer system is restarted
from occurrence of a fault, the computer system
having one or more CPUs, a main memory, and
one or more I/O devices and periodically saving
the internal state of said CPUs and the contents
of said main memory as a checkpoint,
and the internal state of said CPUs and contents
of said main memory of the most recent
checkpoint being restored when a fault occurs
in said computer system to restart data
processing, the computer readable program
code means in said article of manufacture
comprising:

computer readable program code means for
causing a computer to save log data of states
upon setting statuses including an operation
mode with respect to the I/O devices;
computer readable program code means for
causing a computer to initialize the I/O devices
and set statuses of the I/O devices in accordance
with the saved log data, upon occurrence
of a fault in the computer system;
computer readable program code means for
causing a computer to create, when a process
in said computer system makes an I/O request,
a request block in said main memory which
contains information necessary to perform said
I/O request;
computer readable program code means for
causing a computer to perform an I/O operation
by accessing said I/O devices according to a
request block;
computer readable program code means for
causing a computer to initialize, upon restart
from the most recent checkpoint which follows
a fault occurrence, said I/O execution processes
in execution state and causing I/O operations
being performed by said I/O execution
processes to be performed again from the
beginning.